Filter Attributes with Missing Values (Operator Toolbox)

Synopsis

This operator keeps or removes attributes of an ExampleSet based on how many missing values they contain.

Description

This operator filters the attributes of an ExampleSet according to their missing values. The filter method is set by the parameter filter method. The available methods are: all attributes are kept; only attributes with at least one non-missing value are kept; only attributes whose number of missing values does not exceed an absolute maximum are kept; only attributes whose relative number of missing values does not exceed a relative maximum are kept. The absolute and relative thresholds are set by the parameters maximum number of missings and maximum relative number of missings.

Special attributes are ignored and kept by default. If the parameter include special attributes is selected, the filter is also applied to special attributes.
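
The filter methods described above can be illustrated with a small Python sketch. This is not the operator's implementation: the dictionary-based table, the function names, and the use of None for missing values are all hypothetical, and the sketch assumes that the relative number of missing values is the fraction missing / total, between 0 and 1.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for an ExampleSet: attribute name -> column values,
# where None marks a missing value. None of these names come from RapidMiner.
Table = Dict[str, List[Optional[float]]]


def fulfills_filter(values, method, max_missing=0, max_relative_missing=0.0):
    """Return True if one column fulfills the chosen filter method."""
    total = len(values)
    missing = sum(1 for v in values if v is None)
    if method == "keep all":
        return True
    if method == "one or more non-missing":
        return missing < total
    if method == "maximum number missing":
        return missing <= max_missing
    if method == "maximum relative number missing":
        # Assumption: relative number = missing / total, a fraction in [0, 1].
        relative = missing / total if total else 0.0
        return relative <= max_relative_missing
    raise ValueError(f"unknown filter method: {method}")


def filter_attributes(table: Table, method, special=frozenset(),
                      include_special=False, **thresholds) -> Table:
    """Keep columns that fulfill the filter; special ones bypass it by default."""
    kept = {}
    for name, values in table.items():
        bypass = name in special and not include_special
        if bypass or fulfills_filter(values, method, **thresholds):
            kept[name] = values
    return kept


table = {
    "id": [None, None, None, None],   # special attribute, all values missing
    "a": [1.0, 2.0, 3.0, 4.0],        # 0 missing
    "b": [1.0, None, 3.0, 4.0],       # 1 missing (relative 0.25)
    "c": [None, None, None, 4.0],     # 3 missing (relative 0.75)
    "d": [None, None, None, None],    # all values missing
}

for method in ("keep all", "one or more non-missing",
               "maximum number missing", "maximum relative number missing"):
    kept = filter_attributes(table, method, special={"id"},
                             max_missing=1, max_relative_missing=0.5)
    print(method, "->", list(kept))
```

In this sketch the special attribute id survives every method even though all of its values are missing, because it bypasses the filter unless include_special is set to True.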

Input

  • example set (Data Table)

    The input ExampleSet to be filtered.

Output

  • filtered example set (Data Table)

    The filtered ExampleSet, containing only the attributes that pass the filter.

  • original (Data Table)

    The original ExampleSet.

Parameters

  • filter_method

    This parameter selects the method used to filter attributes by their missing values. It has the following options:

    • keep all: All attributes of the ExampleSet are kept; no attributes are removed.
    • one or more non-missing: All attributes that have at least one non-missing value are kept. Attributes that have only missing values are removed.
    • maximum number missing: All attributes whose number of missing values does not exceed a maximum are kept. The maximum is specified by the parameter maximum number of missings.
    • maximum relative number missing: All attributes whose relative number of missing values does not exceed a maximum are kept. The relative maximum is specified by the parameter maximum relative number of missings.

    Range:
  • maximum_number_of_missings

    Only attributes whose number of missing values is less than or equal to this parameter are kept.

    Range:
  • maximum_relative_number_of_missings

    Only attributes whose relative number of missing values is less than or equal to this parameter are kept.

    Range:
  • invert_selection

    If selected, the filter is inverted: all attributes that fulfill the filter condition are removed, and the others are kept. Special attributes are still kept by default. If include special attributes is selected, the inverted filter is applied to them as well.

    Range:
  • include_special_attributes

    Special attributes are attributes with special roles. These are: id, label, prediction, cluster, weight and batch. Custom roles can also be assigned to attributes. By default, all special attributes are kept. If this parameter is set to true, the filter is also applied to special attributes.

    Range:
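
The interplay of invert_selection and include_special_attributes can be sketched in Python as follows. This is only an illustration under stated assumptions, not the operator's code: the function select_attributes, its arguments, and the precomputed missing-value counts are hypothetical, and the sketch uses the maximum number missing method as the filter condition.

```python
def select_attributes(missing_counts, max_missing, special=frozenset(),
                      invert=False, include_special=False):
    """Return the names of the attributes that are kept.

    missing_counts maps a (hypothetical) attribute name to its number of
    missing values; the filter condition is missing <= max_missing.
    """
    kept = []
    for name, missing in missing_counts.items():
        if name in special and not include_special:
            # Special attributes bypass the filter, inverted or not.
            kept.append(name)
            continue
        fulfills = missing <= max_missing
        # Normal mode keeps attributes that fulfill the condition;
        # inverted mode keeps the ones that do not.
        if fulfills != invert:
            kept.append(name)
    return kept


counts = {"id": 0, "a": 0, "b": 2, "c": 5}

normal = select_attributes(counts, max_missing=2, special={"id"})
inverted = select_attributes(counts, max_missing=2, special={"id"}, invert=True)
inverted_all = select_attributes(counts, max_missing=2, special={"id"},
                                 invert=True, include_special=True)

print(normal)        # ['id', 'a', 'b']
print(inverted)      # ['id', 'c']
print(inverted_all)  # ['c']
```

In the last call the special attribute id fulfills the filter condition, so the inverted filter removes it once special attributes are included.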

Tutorial Processes

Demonstration of different filter methods